Image recording apparatus and method using light fields to track position and orientation



(57) The invention provides a simple method and
apparatus for tracking image recording device (110)
motion using a light field. The invention locates the
image recording device's position and orientation in
each frame very precisely by checking the radiance seen
along lines captured in previous frames. The invention
provides an interactive system that provides the
operator with feedback, to capture a sequence of frames
that sufficiently cover the light field, to provide
sufficient data for reconstruction of three-dimensional
structures.

Description

[0001] The invention relates to an apparatus and 
method for image-based rendering. In particular, the in- 
vention relates to an apparatus and method for tracking 
camera motion using a light field in order to provide data 
for later reconstruction of a three-dimensional structure 
or environment. 

[0002] The computer graphics industry has become 
increasingly interested in methods by which three-di- 
mensional objects or environments can be represented 
using a large collection of two-dimensional images. This 
is referred to as image-based rendering. One way in 
which to represent an object or environment is to use 
light fields. 

[0003] A light field is any representation of a
three-dimensional scene by the radiance induced on the
set of incident lines in three-dimensional free space R³.
This includes, but is not limited to, representations
which sample and store this radiance as a collection of
two-dimensional images and representations which sample
and store this radiance in a data structure representing
lines in three-dimensional free space R³.

[0004] For example, if the two-dimensional images of
the object or environment are sampled densely at a
two-dimensional set P of camera positions, the radiance
seen along all the lines passing through P has been
sampled. These radiances can be stored, for example,
in a four-dimensional array where each element
corresponds to a line in three-dimensional free space R³.
Any image of the object from a three-dimensional set of
positions can be reconstructed by collecting the
appropriate lines from this array. Such a representation
is referred to as a light field.

[0005] A light field is best understood with reference
to Figure 1 and the following explanation. Figure 1
shows a light slab representation of a light field.
Consider the set of lines passing through two parallel
planes P₁ and P₂ in the three-dimensional free space R³.
Each pair of points p₁ of the plane P₁ and p₂ of the
plane P₂ defines a unique line. Each line, except for
those parallel to P₁ and P₂, defines a unique pair of
the points p₁ and p₂. So the set of all lines in the
three-dimensional free space R³ is a four-dimensional
space, which can be parameterized by the four
coordinates (u, v, s, t) required to specify the points
p₁ and p₂.
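
The two-plane parameterization can be illustrated with a minimal sketch. It assumes, purely for illustration, that the first plane is z = 0 and the second is z = 1; the function name `line_to_uvst` and these plane positions are hypothetical and do not come from the specification.

```python
def line_to_uvst(origin, direction, z1=0.0, z2=1.0):
    """Return (u, v, s, t) for the line through `origin` along `direction`.

    Assumption: the first plane is z = z1 and the second plane is z = z2.
    Lines parallel to the planes meet neither plane in a single point,
    so they have no (u, v, s, t) coordinates and None is returned.
    """
    ox, oy, oz = origin
    dx, dy, dz = direction
    if dz == 0:
        return None
    a1 = (z1 - oz) / dz  # line parameter where the line crosses the first plane
    a2 = (z2 - oz) / dz  # line parameter where the line crosses the second plane
    u, v = ox + a1 * dx, oy + a1 * dy
    s, t = ox + a2 * dx, oy + a2 * dy
    return (u, v, s, t)


def uvst_to_points(u, v, s, t, z1=0.0, z2=1.0):
    """Recover the two plane points that define the line."""
    return (u, v, z1), (s, t, z2)
```

Every non-parallel line maps to exactly one four-tuple and back, which is why the set of lines behaves as a four-dimensional space.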

[0006] The lines through a specific point p in R³ form
a two-dimensional subset, which is a plane under this 
parameterization. An image is a rectangular subset of 
this "plane of lines" with p as the focal point. 

[0007] Light fields such as these can be used to
reconstruct images by collecting a subset of the
four-dimensional space of lines which contains a lot of
images of an object. This collection is done by pointing
an image recording device at the object and physically
scanning the image recording device across a
two-dimensional square. A two-dimensional set of lines
is collected at each image recording device position.
The lines in all the images are parameterized using two
parallel planes (one containing the moving camera and
the other in front of the object) and stored as a
four-dimensional array. The radiance seen along each
line in the four-dimensional array is addressed by the
line coordinates. An image of the object with the focal
point anywhere in the three-dimensional vicinity of the
two planes can be extracted by collecting the radiance
of each of its lines from this array. In this way,
images can be reconstructed in real time.
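
As a rough sketch of how an image might be extracted from such an array, the following assumes the planes z = 0 and z = 1, a sparse dictionary standing in for the four-dimensional array, and nearest-sample lookup. The names `nearest_key` and `render_image` are hypothetical, and a real renderer would interpolate between samples rather than take the nearest one.

```python
def nearest_key(u, v, s, t, step=1.0):
    """Quantize continuous line coordinates to the nearest stored sample."""
    return tuple(round(c / step) for c in (u, v, s, t))


def render_image(light_field, focal_point, st_pixels, z1=0.0, z2=1.0, step=1.0):
    """Collect the stored radiance for every line of a new image.

    light_field: dict mapping quantized (u, v, s, t) indices to radiance,
                 a sparse stand-in for the four-dimensional array.
    focal_point: (x, y, z) of the virtual camera; must not lie on z = z2.
    st_pixels:   rows of (s, t) points on the plane z = z2, one per pixel.
    Returns rows of radiance values; None marks a line never sampled.
    """
    px, py, pz = focal_point
    if pz == z2:
        raise ValueError("focal point lies on the second plane")
    a = (z1 - pz) / (z2 - pz)  # where each ray crosses the plane z = z1
    image = []
    for row in st_pixels:
        out_row = []
        for s, t in row:
            u = px + a * (s - px)
            v = py + a * (t - py)
            out_row.append(light_field.get(nearest_key(u, v, s, t, step)))
        image.append(out_row)
    return image
```

Each output pixel costs one array lookup, which is consistent with the statement that reconstruction can run in real time.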

[0008] These light fields can be compressed to a
reasonable size and accessed quickly enough to make
them useful for real-time rendering on a high-end
machine. However, the capturing of these light fields
has been limited to mechanically scanning a camera over
a two-dimensional plane using such devices as a
computer controlled camera gantry. Such a device is
disclosed in Levoy et al., "Light Field Rendering",
Computer Graphics Proceedings, SIGGRAPH '96, p. 528,
31-42. In these devices, the computer keeps track of
the camera position at each frame of the light field
capturing process. Such devices are generally
expensive, limited to a specific area and range of
object sizes they can handle, and require large
investments of time and money to produce and use.

[0009] Another way in which to capture the light fields
is to use fiducial points, points in three-dimensional
free space R³ whose exact locations are known with
respect to some coordinate frame. Gortler et al., "The
Lumigraph", Computer Graphics Proceedings, SIGGRAPH
'96, p. 528, 43-54, discloses one method of using
fiducial points to obtain images of a three-dimensional
environment. In this method, however, it is necessary
to have many fiducial points within the environment and
to maintain those fiducial points in the image field
while capturing each image. Thus, the range of motion
of the camera is limited to the area in which a number
of fiducial points are present.

[0010] Therefore, it would be beneficial and more 
practical to be able to use a hand-held camera to per- 
form the image capturing of a three-dimensional envi- 
ronment without being constrained to a particular area 
of movement, i.e. an area containing fiducial points. 
However, a fundamental difficulty in using a hand-held 
device is in tracking the position and orientation of the 
camera at each frame of the image capturing process. 
[0011] Camera tracking is a problem that arises fre- 
quently in computer vision. One technique for camera 
tracking is to measure the optical flow from one image 
frame to the next. In this technique, each pixel is as- 
sumed to correspond to a nearby pixel with the best- 
matching color. An overall combination of the pixel mo- 
tions is interpreted as a motion of the camera. This 
method provides good results when tracking the motion 
between two successive image frames. However, for 
multiple frames, such as are needed for capturing
environments, the error in the camera position
accumulates rapidly.

[0012] Another technique for camera tracking is the
point correspondence method, in which distinctive
looking feature points (such as object corners) are
extracted from an image and tracked from one image
frame to the next. These points act as fiducial points
during camera tracking. This method provides better
results than the optical flow method; however, it is
difficult to track the same three-dimensional points
from image frame to image frame, and the method itself
is rather slow and cumbersome.

[0013] Additionally, the problem of camera tracking for
the capture of images of a three-dimensional
environment for later reconstruction of the environment
is different from the problem of capturing data from
arbitrary video sequences. The camera operator knows
that she or he is trying to capture data of the
environment, and is willing to move the camera in
particular ways in order to do so. Thus, an interactive
data collection system could provide feedback to the
camera operator, giving the operator indications for
maintaining camera tracking and for filling in desired
data.

[0014] The invention provides a method and apparatus
for tracking the motion of an image recording device.
The method and apparatus allow a hand-held image
recording device to be used.

[0015] The invention also provides a simple method
and apparatus for tracking image recording device
motion using a light field.

[0016] The invention further provides an interactive
system that provides the operator with feedback to
capture a sequence of frames that sufficiently cover
the light field.

[0017] The invention additionally provides an
interactive system that provides the operator with
feedback to provide sufficient data for reconstruction
of three-dimensional structures.

[0018] The method and apparatus of the invention
locate the position and the orientation of the image
recording device in each frame by checking the radiance
along lines in the frame and corresponding lines in
previous frames. Thus, a separate device to keep track
of the location of the image recording device is not
necessary.

[0019] The method and apparatus of the invention
locate the image recording device's position and
orientation in each frame very precisely by checking
the radiance seen along lines captured in previous
frames.

[0020] These and other features and advantages of
this invention are described in or are apparent from
the following detailed description of the preferred
embodiments.

[0021] The preferred embodiments of this invention 
will be described in detail, with reference to the following 
figures wherein: 

Fig. 1 shows a light slab representation of a light
field;

Fig. 2 shows a block diagram of the apparatus of
the invention;

Fig. 3 is a diagram of one embodiment of the invention
in which messages to the operator are displayed and/or
announced;

Fig. 4 is a diagram of one embodiment of the invention
in which an image pickup device path is displayed;

Fig. 5 is a diagram of one embodiment of the invention
in which movement instructions are displayed to the
operator;

Fig. 6 is a flowchart outlining one embodiment of
the method of this invention;

Fig. 7 is a flowchart showing in greater detail the
new frame position and orientation determining step of
Fig. 6; and

Fig. 8 shows two frames along a curve of the movement
of an image recording apparatus.

[0022] Figure 2 shows an image recording apparatus
100 including an image pickup device 110, a processor
120, a frame storage 130, a memory 140, an output
controller 150 and an output device 160. The image
pickup device 110 detects an image of the environment
in which the image pickup device 110 is operating. This
image, once recorded, is a frame. The image pickup
device 110 can be a video camera, still photo camera,
digital camera, electronic camera, image editing
machine, and the like. In one preferred embodiment, the
image pickup device 110 is a hand-held camera. Each
recorded frame is input to the processor 120, which
stores the frames in the frame storage 130. The frame
storage 130 may include a videotape, photographic film,
a CD-ROM, magnetic media, RAM, a hard drive, a floppy
disk and disk drive, flash memory or the like. The
frame I* is then sent to the processor 120.

[0023] The processor 120 determines a position of the
image pickup device 110 for each frame of the sequence
of m frames obtained from the image pickup device 110.
This may be done using any method by which the
positions of known points in the image frames are used
to determine the relative position of the image pickup
device 110. The processor 120 then determines a subset
of lines in each frame that pass through the position
of the image pickup device 110.

[0024] The image pickup device 110 then captures a
new frame of image data and sends it to the processor
120. The position and the orientation of the image
pickup device 110 in the new frame I is determined as
outlined below.

[0025] Determining the position and orientation of the
image pickup device 110 is performed in real time so
that feedback can be provided to the image recording
device operator through the output device 160. For
instance, the operator may fail to move the camera
slowly or smoothly enough to maintain registration.
Registration is the matching up of previous and current
images. As shown in Figure 3, if registration is lost,
the image recording apparatus 100 can provide an
instruction to the operator to return to the fiducial
points. The fiducial points are those points whose
exact locations are known a priori. This instruction
may be either visual, auditory, or both. For instance,
the image recording apparatus 100 may display the
message "Return to Fiducial Points" on display 160, as
shown in Figure 3, and/or may announce a message via
the speaker 170.

[0026] Likewise, it may be helpful to the operator to
obtain visual feedback of the current path that the
operator has traversed. Accordingly, as shown in
Figure 4, in one embodiment of the invention, the
processor 120 can output to the display 160 a visual
display of dots 162 depicting previously calculated
image recording device positions, a curve showing the
path traversed, or the like.

[0027] Additionally, as shown in Figure 5, the image
recording apparatus 100 can provide movement
instructions on the display 160 to the operator. These
movement instructions instruct the operator to move to
positions which provide data for parts of the light
field which are sparsely sampled. The image recording
apparatus 100 can also provide instructions that direct
the operator away from areas for which registration is
not yet supported by previous frame data.

[0028] The elements of the image recording apparatus
100 may be part of a single unit, such that all the
elements are housed within a single housing 180, as
shown in Figure 4, or may be distributed, as shown in
Figure 3. For instance, the image recording apparatus
100 may be a single, hand-held image recording
apparatus 100 in which the image pickup device 110, the
processor 120, the frame storage 130 and the output
devices 160 and 170 are housed. Alternatively, the
image pickup device 110 may be a video recording device
connected via cables to a device, such as a computer
or personal digital assistant, that houses the
processor 120, the memories 130 and 140 and the output
devices 160 and 170. Other combinations may be used to
implement the invention, as will be readily apparent to
those of ordinary skill in the art.

[0029] Figure 6 is a flowchart outlining a preferred
method of image recording according to this invention.
As shown in Figure 6, the method starts with step S100.
Then, in step S200, the image recording device
captures a sequence of m frames and stores them as a
set of pixel lines in the frame storage 130. Next, in
step S300, the camera positions, defined as the focal
points "p" and "p'", having orientations "o" and "o'",
respectively, are determined. A simple way in which to
produce a sequence of precisely registered frames with
which to begin is to use fiducial points, as discussed
above.

[0030] To obtain the fiducial points, many different
methods may be employed. In a preferred embodiment,
the fiducial points may be obtained by building a
physical coordinate frame, such as three distinct
arrows attached at right angles. The points of the
three arrows, the fiducial points, are easy to find in
an image. From the fiducial points, the position and
orientation of the camera relative to the coordinate
frame is determined. The image recording device can
then move slowly about the fiducial points and capture
the desired images. In this way, an initial light field
is obtained. After the initial light field is obtained,
it is no longer necessary to maintain the fiducial
points in the field of view of the image pickup device
110.

[0031] Then, in step S400, for each of these frames,
the image is identified with a subset of the set of
pixel lines which have been stored in the light field.
The sequence of focal points forms a sequence "P" of
"m" points along a curve "C" in the three-dimensional
free space R³. The curve "C" represents the movement of
the camera within the environment.

[0032] Next, in step S500, the image pickup device
110 obtains a new frame I, i.e., a new subset of lines 
through some new, unknown focal point p. The goal is 
to determine p and o by correlating the new frame with 
previous frames in the light field. 

[0033] It is important to observe that a line "l"
through p and any other point p' in the set "P" is
likely to have been captured already, by a previous
frame I' corresponding to p' and stored in the light
field. Specifically, l is captured in both frames I and
I' if the image recording device is oriented at p' such
that the projection of p onto the image plane of the
previous frame I' falls within the previous frame I'
and, conversely, the image recording device is oriented
at p so that the projection of p' onto the image plane
of the new frame I falls within the new frame I. Thus,
knowing the correct p and o for the image recording
device, the radiance assigned to l in I should be
identical to the radiance assigned to l in I'. The
values for p and o correspond to a prediction of the
positions of the lines L = {l₁, ..., lₘ} in I and of
the radiance at those positions. When m is large, this
gives a very precise test for the correct values of p
and o.
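
The visibility condition above can be sketched as follows. This is an illustration under stated assumptions, not the specification's method: the camera is modelled as an ideal pinhole, each orientation is given as a 3x3 world-to-camera rotation matrix, and the image plane is a rectangle of hypothetical half-width `half_w` and half-height `half_h` at distance `focal`. "Projection" is read here as the point where the line through the two focal points meets the image plane, so a focal point behind the camera still projects.

```python
def project(point, cam_pos, cam_rot, focal=1.0):
    """Project `point` through the focal point `cam_pos` onto the image plane.

    cam_rot is a 3x3 world-to-camera rotation given as three rows.
    Returns None when the line through `point` and `cam_pos` is parallel
    to the image plane (camera-frame depth of zero).
    """
    d = [point[i] - cam_pos[i] for i in range(3)]
    x, y, z = (sum(cam_rot[r][i] * d[i] for i in range(3)) for r in range(3))
    if z == 0:
        return None
    return (focal * x / z, focal * y / z)


def in_frame(point, cam_pos, cam_rot, half_w=0.5, half_h=0.5, focal=1.0):
    """True if the projection of `point` falls inside the image rectangle."""
    uv = project(point, cam_pos, cam_rot, focal)
    return uv is not None and abs(uv[0]) <= half_w and abs(uv[1]) <= half_h


def line_shared(p, o, p_prev, o_prev, **frame):
    """True if the line through p and p_prev lies in both frames."""
    return in_frame(p, p_prev, o_prev, **frame) and in_frame(p_prev, p, o, **frame)
```

With two cameras facing the same way, one directly behind the other, the connecting line is shared; with the same cameras far apart side by side it is not, because each focal point projects outside the other's frame.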

[0034] Then, in step S600, the image recording device
position p and orientation o are determined for the
new frame. In step S700, the control routine determines
whether operation should be repeated for a new image 
frame. This may be done by either determining if the im- 
age pickup device 110 is still recording, determining if 
the image pickup device 110 has moved, and the like. If 
the operation is to be repeated, control returns to step 
S500. Otherwise, control continues to step S800 where 
the control routine stops. 
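
The control flow of Figure 6 can be summarized in a short sketch. The function and parameter names are hypothetical, the light field is reduced to a list of (frame, pose) pairs for illustration, and adding each tracked frame back into the light field is one option the description mentions rather than a required step.

```python
def record(initial_frames, register_initial, new_frames, estimate_pose):
    """Sketch of steps S200 to S800 under the assumptions stated above.

    register_initial(frame) -> pose, e.g. from fiducial points (S300).
    estimate_pose(frame, light_field) -> pose of a new frame (S600).
    """
    # S200 to S400: capture, register and store the initial frames.
    light_field = [(frame, register_initial(frame)) for frame in initial_frames]
    poses = []
    # S500: obtain a new frame; S700: repeat while frames keep arriving.
    for frame in new_frames:
        pose = estimate_pose(frame, light_field)  # S600
        poses.append(pose)
        light_field.append((frame, pose))  # optional growth of the light field
    return poses, light_field  # S800: stop
```

Passing the pose estimator in as a callable keeps the loop independent of how step S600 is carried out.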

[0035] Figure 7 shows the position and orientation
determination step S600 in greater detail. In
particular, starting from step S600, control continues
to step S610. In step S610, a rough estimate "e" of the
position p and the orientation o is determined by
identifying a small volume "v" of a space "S" which is
likely to contain the image pickup device 110. The
space S is a six-dimensional space of possible
positions and orientations for the image pickup device
110.

[0036] In step S610, to get the rough estimate e of the
image recording device's position and orientation, some
assumptions about the motion of the image pickup
device 110 are made. If the image pickup device 110
does not move too fast, the position p cannot be too
different from the previous position p'. If the image
pickup device 110 does not change its translational or
rotational trajectory suddenly, the new position p
should roughly extrapolate some number of previous
positions p'. These assumptions are likely to be good
since the operator knows that he/she is trying to
capture an image for later reconstruction of the
environment and is likely to operate the image pickup
device 110 accordingly.
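
One simple way to realize such an extrapolation is a constant-velocity prediction from the last two poses. This is only a sketch: the pose is assumed to be a six-tuple of three position coordinates and three small rotation angles, linear extrapolation of angles is an approximation that holds only for small rotations, and `extrapolate_pose` is a hypothetical name.

```python
def extrapolate_pose(history):
    """Predict the next pose from previous poses, assuming smooth motion.

    history: list of poses, oldest first, each a tuple of numbers.
    With one pose, the prediction is that pose (slow-motion assumption).
    With two or more, the last step is repeated (constant velocity).
    """
    if not history:
        raise ValueError("at least one previous pose is required")
    if len(history) == 1:
        return tuple(history[-1])
    prev, last = history[-2], history[-1]
    return tuple(2 * b - a for a, b in zip(prev, last))
```

The prediction serves only as the centre of the small search volume, so it needs to be roughly right rather than exact.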
[0037] Traditional methods may also be used to ob- 
tain a rough estimate of the position p in the new frame 
I. For instance, the optical flow (interpreting the differ- 
ence between two images as a motion of a set of points 
of constant radiance) between the previous frame I' and
the current frame I is not expensive to compute and can
give an estimate of the motion of the image pickup de- 
vice 110. Likewise, the point correspondence method 
may be used to obtain a rough estimate of the image 
pickup device's position. 

[0038] Then, in step S620, the volume v is densely
sampled to find a close match to the image. That is,
given the rough estimate e in the space S, a small
surrounding volume of S is sampled to find a point or
pixel which best matches one or more lines in I to
corresponding lines stored in the light field. Next, in
step S630, the error of the match is determined as a
function "F" on points of the space S. The value of F
at a point "s" in the space S is the difference between
the predicted radiance of the portion of L overlapping
I, when I is assumed to lie at s, and the observed
values at the corresponding points in I. The difference
measure is chosen to allow for the existence of
outliers, to account for random image noise, but to
require fairly precise matching of non-outliers, which
should suffer only from quantization error.
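
The specification does not give a formula for F. One measure with the stated properties is a truncated absolute difference, sketched below as an assumption: each line contributes its radiance difference up to a cap, so a few outliers cannot dominate while small mismatches among the remaining lines still count in full. The name `match_error` and the cap value are hypothetical.

```python
def match_error(predicted, observed, outlier_cap=0.2):
    """Mean truncated absolute difference between two radiance lists.

    predicted: radiances predicted from the light field for a candidate pose.
    observed:  radiances actually seen along the same lines in the new frame.
    Differences above `outlier_cap` are clipped to it (outlier tolerance).
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    total = sum(min(abs(a - b), outlier_cap) for a, b in zip(predicted, observed))
    return total / len(predicted)
```

A perfect match scores zero, and a single wildly wrong line adds at most `outlier_cap` divided by the number of lines.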

[0039] Lastly, in step S640, the match is optimized
to minimize the error in order to provide the closest
correspondence. In step S650, control returns to step
S700.
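
Sampling and optimization can be sketched together as a coarse-to-fine grid search around the rough estimate. This is one possible strategy chosen for illustration, not the one prescribed by the specification. The name `refine_pose`, the parameters `radius`, `steps` and `rounds`, and the halving schedule are all assumptions; `error_fn` stands for any function that plays the role of F.

```python
import itertools


def refine_pose(estimate, error_fn, radius=0.1, steps=2, rounds=3):
    """Search a small volume around `estimate` for the pose minimizing error_fn.

    Each round samples a regular grid of (2 * steps + 1) points per axis
    within `radius` of the current best pose, then halves the radius.
    Returns (best_pose, best_error).
    """
    best = tuple(estimate)
    best_err = error_fn(best)
    for _ in range(rounds):
        offsets = [radius * k / steps for k in range(-steps, steps + 1)]
        centre = best
        # Dense sampling of the volume around the current centre.
        for delta in itertools.product(offsets, repeat=len(centre)):
            cand = tuple(c + d for c, d in zip(centre, delta))
            err = error_fn(cand)
            if err < best_err:
                best, best_err = cand, err
        radius /= 2  # shrink the volume around the best match so far
    return best, best_err
```

For a six-dimensional pose, the defaults evaluate 5⁶ = 15,625 candidates per round, so the grid has to stay coarse. A gradient-based optimizer could replace the later rounds if F is smooth enough.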

[0040] Thus, in the present invention, the position of 
an image recording device can be accurately tracked by 
correlating radiance lines in a newly captured frame with 
radiance lines in a previously captured light field. 
[0041] These correlated frames may then be added 
to the existing light field in order to enlarge the light field. 
The addition of these frames to the light field may be 
performed dynamically as frames are being captured or 
may be performed off-line. In this way, the light field may 
be preprocessed off-line before use in further camera 
tracking and image recording. 

[0042] Furthermore, this process of adding frames to 
the initial light field may be repeated in order to obtain 
larger and larger light fields. In this way, entire
environments may be captured for later reconstruction.

[0043] Using the tracking of the present invention, it
is not necessary to maintain fiducial points in the
image frame once the initial light field has been
obtained. After the initial light field is obtained,
tracking is accomplished by correlating radiance lines
in a new frame with those in the light field. This
allows the image recording apparatus to be used in a
much larger environment.
[0044] As shown in Fig. 4, the image recording appa- 
ratus 100 is preferably implemented using a pro- 
grammed general purpose computer. However, the im- 
age recording apparatus 100 can also be implemented 
using a special purpose computer, a programmed mi- 
croprocessor or microcontroller and peripheral integrat- 
ed circuit elements, an ASIC or other integrated circuit, 
a hardware electronic or logic circuit such as a discrete 
element circuit, a programmable logic device such as a 
PLD, PLA, FPGA or PAL, or the like. In general, any
device capable of implementing a finite state machine
that in turn implements the flowcharts shown in
Figures 6 and 7 can be used to implement the image
recording of this invention.
[0045] While the invention has been described as us- 
ing fiducial points to obtain an initial light field, it is not 
limited to such an arrangement. The initial light field may 
be obtained using a gantry type device, the point corre- 
spondence method, or any other method readily appar- 
ent to one of skill in the art. 

Claims

1. An image recording apparatus comprising:

an image pickup device that acquires images
of the surrounding environment, the images
being a sequence of frames;
a frame storage that stores the sequence of
frames acquired by the image pickup device;
and
a processor that determines a position and an
orientation of the image pickup device based
on the sequence of frames, wherein
the frames comprise a plurality of lines and the
processor determines the position of the image
pickup device by correlating a radiance along
at least one of the plurality of lines in a current
frame with the radiance along corresponding
ones of the plurality of lines of at least one
previous frame in the sequence of frames.

2. The image recording apparatus of claim 1, wherein
the processor determines an initial light field from
the sequence of frames stored in the frame storage.

3. The image recording apparatus of claim 2, wherein
the processor correlates a radiance along at least
one of the plurality of lines in a current frame with
the radiance along corresponding ones of the
plurality of lines of the initial light field to
determine the position and the orientation of the
image pickup device.

4. The image recording apparatus of claim 3, wherein,
after the position and orientation of the image
pickup device is determined, the processor adds the
current frame to the initial light field.

5. Apparatus according to any of the preceding claims,
further comprising an interactive display that
displays a current frame through the image pickup
device and that provides instructions and information
to an operator of the image recording apparatus.

6. A method of determining a position and an
orientation of an image recording apparatus,
comprising:

recording a sequence of image frames;
determining the position and the orientation of
the image recording apparatus for each frame
of the sequence of frames;
recording a new image frame;
determining the position and orientation of the
image recording apparatus for the new image
frame based on the position and orientation of
the image recording apparatus for frames in the
sequence of image frames.

7. The method of claim 6, wherein determining the
position and the orientation of the image recording
apparatus for each frame of the sequence of frames
comprises determining the position and the
orientation of the image recording apparatus relative
to fiducial points.

8. The method of claim 6 or claim 7, wherein
determining the position and the orientation of the
image recording apparatus for the new image frame
comprises locating at least one line coincident in at
least one image frame of the sequence of image frames
and in the new image frame, the at least one located
line having a radiance that is substantially identical
in the at least one image frame of the sequence of
image frames and in the new image frame.

9. The method of any of claims 6 to 8, further
comprising determining an initial light field from the
sequence of image frames, wherein the step of
determining the position and orientation of the image
recording apparatus for a new image frame includes
correlating a radiance along at least one line in the
new image frame with the radiance along a
corresponding line of the initial light field.

10. The method of claim 9, wherein, after the position
and orientation of the image recording apparatus is
determined, the new image frame is added to the
initial light field.